Testing Distributional Hypothesis in Patent Translation
Authors
Abstract
This paper presents a wordlist-based lexical richness approach to testing the distributional hypothesis for genre analysis in translation studies. In recent years, there has been continuing interest in patent translation. However, only a few studies have focused on comparing native and non-native writing. The proposed approach examines the distribution of technical words in United States Patent and Trademark Office (USPTO) and Japan Patent Office (JPO) patents in terms of lexical variation, lexical density, and lexical sophistication. In brief, it highlights distributional similarity in the technical genre and, in particular, distributional differences in the academic and general genres.
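The three measures named in the abstract can be illustrated with a minimal sketch. The definitions used here are common textbook ones and are assumptions, not necessarily those of the paper: lexical variation as the type-token ratio, lexical density as the share of content words among all tokens, and lexical sophistication as the share of content words that fall outside a basic wordlist. The names `FUNCTION_WORDS`, `tokenize`, `lexical_richness`, and `basic_wordlist` are hypothetical and introduced only for illustration.

```python
import re

# Hypothetical, deliberately tiny function-word list (assumption).
# A real study would use a fuller stop list or part-of-speech tags.
FUNCTION_WORDS = {
    "a", "an", "the", "of", "in", "on", "to", "and", "or", "is", "are",
    "was", "were", "by", "with", "for", "that", "this", "it", "as", "at",
}


def tokenize(text):
    """Lowercase the text and return alphabetic tokens (hyphenated words kept whole)."""
    return re.findall(r"[a-z]+(?:-[a-z]+)*", text.lower())


def lexical_richness(text, basic_wordlist):
    """Return lexical variation, density, and sophistication for one text.

    basic_wordlist: set of words treated as 'basic' (assumption); content
    words outside it count as sophisticated.
    """
    tokens = tokenize(text)
    if not tokens:
        raise ValueError("text contains no tokens")

    # Content words are approximated as tokens that are not function words.
    content = [t for t in tokens if t not in FUNCTION_WORDS]
    sophisticated = [t for t in content if t not in basic_wordlist]

    return {
        # Distinct word types divided by total tokens (type-token ratio).
        "variation": len(set(tokens)) / len(tokens),
        # Content-word tokens divided by total tokens.
        "density": len(content) / len(tokens),
        # Content-word tokens outside the basic wordlist, divided by content tokens.
        "sophistication": len(sophisticated) / len(content) if content else 0.0,
    }


if __name__ == "__main__":
    sample = "The valve is connected to the actuator and the valve rotates"
    print(lexical_richness(sample, basic_wordlist={"connected"}))
```

Running the same function over comparable USPTO and JPO samples with technical, academic, and general wordlists would give per-genre scores that can then be compared. The choice of wordlists and any significance test is left open here.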
Similar resources
Hypothesis Testing based Intrinsic Evaluation of Word Embeddings
We introduce the cross-match test, an exact, distribution-free, high-dimensional hypothesis test, as an intrinsic evaluation metric for word embeddings. We show that cross-match is an effective means of measuring distributional similarity between different vector representations and of evaluating the statistical significance of different vector embedding models. Additionally, we find that cross-m...
Testing the Distributional Hypothesis: The Influence of Context on Judgments of Semantic Similarity
Distributional information has recently been implicated as playing an important role in several aspects of language ability. Learning the meaning of a word is thought to be dependent, at least in part, on exposure to the word in its linguistic contexts of use. In two experiments, we manipulated subjects’ contextual experience with marginally familiar and nonce words. Results showed that similar...
Automatic Translation of Scholarly Terms into Patent Terms Using Synonym Extraction Techniques
Retrieving research papers and patents is important for any researcher assessing the scope of a field with high industrial relevance. However, the terms used in patents are often more abstract or creative than those used in research papers, because they are intended to widen the scope of claims. Therefore, a method is required for translating scholarly terms into patent terms. In this paper, we...
NTCIR-7 Experiments in Patent Translation based on Open Source Statistical Machine Translation Tools
This paper describes our experimental methods and results in the NTCIR-7 Patent Translation Task [1]. As the first step of our research in machine translation, we integrated a series of open-source software tools to build a statistical translation model. The experimental results demonstrated that we still need to improve the performance and efficiency of both model training and testing.
Efficient Neural-based patent document segmentation with Term Order Probabilities
The internationally growing trend of patent applications puts great pressure on the agents involved in managing this kind of information and creates a demand for efficient and effective patent analysis methods. This work presents a computationally efficient approach for patent document segmentation based on structured ANNs and a simple distributional semantics composition method. The conducted ...